Dynamic parameters in Sequential Decision Making

Abstract

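Sequential Decision Making (SDM) problems optimize over the sequence of actions (or decisions) taken to minimize an underlying cumulative cost. This sequence of actions is referred to as the policy of the SDM problem. Often these problems comprise additional (fixed and manipulable) parameters; the objective is to determine the optimal policy as well as the manipulable parameters that minimize the SDM cost. In this paper we address the class of SDM problems characterized by dynamic parameters, where the dynamics are pre-specified for a subset of the parameters and manipulable for the others. The objective is to determine the manipulable parameter dynamics and the time-varying policy such that the associated SDM cost is minimized at each time instant. To this end, we develop a control-theoretic framework to design the parameter dynamics such that it tracks the optimal values of the parameters and simultaneously determines the optimal policy. Our methodology builds upon a Maximum Entropy Principle (MEP) based framework that addresses SDMs. More precisely, the above framework results in a smooth approximation of the cost, which we utilize as a control Lyapunov function. We show that under the resulting control law the parameters asymptotically track the local optimal, that the proposed control law is Lipschitz continuous and bounded, and that the policy is optimal for a given set of parameter values. Simulations demonstrate the efficacy of our methodology.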
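
The abstract's "smooth approximation" can be illustrated with a generic sketch. It assumes the standard maximum-entropy construction: a free energy of log-sum-exp form whose minimizing distribution is a Gibbs distribution. This is not the authors' algorithm; the function names, the annealing parameter `beta`, and the toy costs are hypothetical choices made only for illustration.

```python
import math

def soft_min(costs, beta):
    """Smooth approximation of min(costs) (assumed free-energy form).

    F_beta = -(1/beta) * log(sum_i exp(-beta * c_i)).
    F_beta never exceeds min(costs) and approaches it as beta grows.
    """
    m = min(costs)
    # Shift by the minimum so the exponentials cannot overflow.
    s = sum(math.exp(-beta * (c - m)) for c in costs)
    return m - math.log(s) / beta

def gibbs_weights(costs, beta):
    """Maximum-entropy (Gibbs) weights over the candidate actions."""
    m = min(costs)
    w = [math.exp(-beta * (c - m)) for c in costs]
    z = sum(w)
    return [x / z for x in w]

# Toy costs of three candidate actions (illustrative values only).
costs = [1.0, 2.0, 3.0]
smooth_low = soft_min(costs, beta=0.1)    # loose, very smooth approximation
smooth_high = soft_min(costs, beta=50.0)  # close to the hard minimum
weights = gibbs_weights(costs, beta=50.0)
```

Unlike the hard minimum, the soft minimum is differentiable in the costs, which is the property that makes a surrogate of this kind usable as a candidate Lyapunov function in gradient-based control designs.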

Similar resources

Dynamic Teaching in Sequential Decision Making Environments

We describe theoretical bounds and a practical algorithm for teaching a model by demonstration in a sequential decision making environment. Unlike previous efforts that have optimized learners that watch a teacher demonstrate a static policy, we focus on the teacher as a decision maker who can dynamically choose different policies to teach different parts of the environment. We develop several ...

Convergence in a sequential two stages decision making process

We analyze a sequential decision making process, in which at each step the decision is made in two stages. In the first stage a partially optimal action is chosen, which allows the decision maker to learn how to improve it under the new environment. We show how inertia (cost of changing) may lead the process to converge to a routine where no further changes are made. We illustrate our scheme with some...

Sequential forecasting and decision making in dynamic and incomplete environments

In many real-world data analysis problems observations arrive sequentially in time and it is required to perform inference on-line. Sequential learning provides us with techniques to fuse information, learn policies, analyse risks, forecast outcomes and make decisions in such a way that a current model is updated as new information becomes available. This framework of sequential learning is par...

Possibilistic sequential decision making

Article history: Received 21 March 2013; received in revised form 14 November 2013; accepted 15 November 2013; available online 19 December 2013.

Journal

Journal title: Automatica

Year: 2023

ISSN: 1873-2836, 0005-1098

DOI: https://doi.org/10.1016/j.automatica.2022.110795